RM2Target: a comprehensive database for targets of writers, erasers and readers of RNA modifications

Abstract RNA modification is a dynamic and reversible process regulated by a series of writers, erasers and readers (WERs). Abnormal changes of WERs will disrupt the RNA modification homeostasis of their target genes, leading to the dysregulation of RNA metabolisms such as RNA stability and translation, and consequently to diseases such as cancer. A public repository hosting the regulatory relationships between WERs and their target genes will help in understanding the roles of RNA modifications in various physiological and pathological conditions. Previously, we developed a database named ‘m6A2Target’ to host targets of WERs in m6A, one of the most prevalent RNA modifications in eukaryotic cells. To host all RNA modification (RM)-related WER–target associations, we hereby present an updated database, named ‘RM2Target’ (http://rm2target.canceromics.org/). In this update, RM2Target encompasses 1 619 653 WER–target associations for nine RNA modifications in human and mouse, including m6A, m6Am, m5C, m5U, m1A, m7G, pseudouridine, 2′-O-Me and A-to-I. Extensive annotations of target genes are available in RM2Target, including but not limited to basic gene information, RNA modifications, RNA–RNA/RNA–protein interactions and related diseases. Altogether, we expect that RM2Target will facilitate further downstream functional and mechanistic studies in the field of RNA modification research.


INTRODUCTION
RNA modifications have been known to play critical roles in modulating RNA metabolisms, such as gene transcription, RNA stability, RNA splicing, nuclear localization and translation (1). RNA modifications are dynamically mediated by three different classes of proteins: writers, erasers and readers (WERs). Writers can regulate the deposition of RNA modifications; e.g., METTL3 and NSUN2 can write N6-methyladenosine (m6A) and 5-methylcytosine (m5C) to their target genes, respectively (2,3). Erasers remove RNA modifications from target genes; e.g., FTO is an eraser for m6A and ALKBH3 for N1-methyladenosine (m1A) (4,5). Readers serve their functions by recognizing RNA modification sites in target genes; e.g., YTHDF1/2/3 are readers for m6A (6) and ALYREF is a reader for m5C (7). The dysregulation of WERs has been proved to be associated with various diseases, including cancer, cardiovascular diseases and neurological diseases (8)(9)(10). The same WER may have distinct functions under different conditions. For example, METTL3 plays an oncogenic role in most cancer types (11,12), but it has also been reported to have tumorsuppressive functions in certain cancer types, such as kidney cancer (13). This is mainly because perturbation of a WER may selectively affect different sets of target genes in different conditions. Taken together, identifying WER-target associations is particularly important for studying the functions and regulatory mechanisms of RNA modifications in various physiological and pathological conditions. However, there is yet no public repository to host WER-target associations across different RNA modifications.
Benefitting from the development of experimental and high-throughput sequencing technologies, more evidence could be provided to explore the relationship between WER and target genes. For instance, immunoprecipitation (14) and certain next-generation sequencing tech-D270 Nucleic Acids Research, 2023, Vol. 51, Database issue nologies such as RIP-seq (15) and CLIP-seq (16) can directly and effectively elucidate the binding relationship between WER proteins and target RNAs. Perturbation techniques such as small interfering RNA (siRNA) (17) and CRISPR/Cas9 system (18) together with high-throughput sequencing could be used to systematically evaluate the perturbation effects of WERs on the RNA metabolisms of specific genes. In 2020, we developed m6A2Target, a comprehensive resource for targets of m6A writers, erasers and readers (19). m6A2Target is the first resource focused on the WER-target associations and has been widely used since its publication. Over the past 2 years, there has been an explosion of studies related to RNA modifications. With the discovery of more WERs, the number of target genes has also expanded, necessitating an update of m6A2Target.
In this study, we present an updated database called RM2Target (http://rm2target.canceromics.org/) to host WER-target associations for 63 WERs from two organisms and nine types of RNA modifications, including N6-methyladenosine (m6A), N6,2 -O-dimethyladenosine (m6Am), N1-methyladenosine (m1A), pseudouridine (), 5-methyluridine (m5U), 5-methylcytosine (m5C), 7-methylguanosine (m7G), 2 -O-methylation and A-to-I RNA editing. Compared to m6A2Target, RM2Target is far superior in all aspects (Supplementary Table S1). RM2Target encompasses 1 619 653 WER-target associations in different cell lines or tissue types derived from published literature and public high-throughput datasets. All records are categorized into three evidence types, namely 'validated targets' collected from low-throughput methods, 'binding targets' inferred from high-throughput methods such as iCLIP-seq, eCLIp-seq, HITS-CLIP-seq, PAR-CLIP-seq, RIP-seq and ChIP-seq and 'perturbation targets' inferred from WERs perturbation followed by high-throughput sequencings such as RNA-seq, MeRIPseq and Ribo-seq ( Figure 1). To allow users to further study the functions of WERs and target genes and investigate the relationship between RNA modifications and diseases, RM2Target provides basic gene information as well as abundant annotations, including RNA modifications, RNA-RNA/RNA-protein interactions, expression correlation between WERs and their target genes in TCGA cancer types, and RNA-disease associations. Overall, we expect that RM2Target will be a valuable resource for researchers who are aiming at exploring the regulatory principles of WERs or finding certain therapeutic targets.  Table S3). All experimental details and downstream effects were integrated manually. (ii) 'Binding': 72 datasets about protein-DNA interactions, protein-RNA interactions and protein-protein interactions from ChIP-seq, RIP-seq, CLIP-seq and mass spectrometry were collected. Further analyses were performed to identify the binding targets of different WERs in different cell lines or tissues. (iii) 'Perturbation': 285 datasets were collected under different perturbation conditions, such as knock-out, overexpression and mutation. Different analyses were performed to obtain potential targets changed at the level of gene expression, modification, translation, alternative splicing or stability according to the sequencing method, like RNA-seq, MeRIP-seq or Ribo-seq (Supplementary Table  S4).

Data collection and quality control
For all datasets, Genome Reference Consortium Human Build 38 (hg38) and Genome Reference Consortium Mouse Build 39 (mm39) were used as the reference genomes for human and mouse, respectively. fastp (21) was employed to perform quality control and preprocessing of all raw data.

Derivation of potential targets with binding evidence
To obtain potential targets that have protein-protein interactions with WERs, mass spectrometry results were summarized from the supplementary data of relevant literature.
To obtain potential targets that have protein-DNA interactions with WERs, preprocessed ChIP-seq data were aligned to the reference genome using bowtie2 (22), and peaks were called using MACS2 (23). HOMER (24) was used to annotate peaks to genes. All genes with ChIPseq peaks were considered as potential WER targets with protein-DNA binding evidence.
To obtain potential targets that have protein-RNA interactions with WERs, RIP-seq and various types of CLIPseq data were collected. For RIP-seq, STAR (25) was used for read alignment and then RSEM (26) was used to obtain the count and FPKM matrix. Genes with a fold change between IP and INPUT greater than 2 (IP/INPUT > 2) were considered as potential binding targets. For various types of CLIP-seq, bowtie2 (22) was first invoked to do the read mapping. Then, CLIP Tool Kit (27) was used to identify the potential binding sites for iCLIP-seq, eClip-seq and HITS-CLIP-seq, and PARalyzer (28) was used for PAR-CLIPseq. HOMER was used to annotate all binding sites. All genes with binding sites were considered as potential WER targets with protein-RNA binding evidence.

Derivation of potential targets with perturbation evidence
To obtain potential targets with perturbation evidence, differential analyses were performed between control group versus perturbed group at five levels: RNA expression, modification, translation, alternative splicing and stability. Group information for all analyses can be found in the Supplementary Table S3.
For RNA expression, preprocessed RNA-seq data were aligned to the reference genome using STAR (25). RSEM Figure 1. Overall design and construction of RM2Target. WER-target associations deposited in RM2Target were categorized into three evidence types, namely 'validated targets' derived from manual curation, 'binding targets' inferred from binding region analysis and 'perturbation targets' predicted from differential analysis. RM2Target provides basic gene information as well as abundant annotations, including RNA modifications, RNA-RNA/RNAprotein interactions, expression correlation between WERs and their target genes in TCGA cancer types, and RNA-disease associations. (26) was then used to generate gene counts, and DESeq2 (29) was used to obtain the differentially expressed genes (P-value < 0.05, fold change > 1.5). For modification enrichment, MeRIPseqPipe (30) was used to automatically analyze MeRIP-seq data, where MACS2 and MeTPeak (31) were used for m6A peak calling and DESeq2 was set to compare the modification levels (P-value < 0.05, fold change > 1.5). For m5C BS-seq, information about the modification sites and quantification results were downloaded from relevant GEO datasets. For RNA stability, half-life of mRNA was calculated according to a previously published paper (32). Genes with a fold change in half-life between perturbed group and control group > 1.2 were considered as potential targets. For translation efficiency, Ribo-seq data were filtered for mitochondrial DNA and ribosomal RNA using bowtie2 (22), followed by read alignment using STAR (25). RSEM (26) was used to obtain the count and FPKM matrix of genes. The difference between two groups was calculated using log 2 (FPKM of perturbed group/FPKM of control group), with a cutoff of 1.5 for the fold change. For alternative splicing, rMATS (33) was used to detect different alternative splicing events. The differential alternative splicing events (FDR < 0.01) were considered as potential targets.

Motif analysis of target genes
To identify the consensus motifs of target genes for each WER, we use the top 200 most reliable targets deter-mined by confidence score to perform motif analysis with HOMER (24). Only targets with binding peaks detected by CLIP-seq or modification peaks detected by MeRIP-seq among top 200 targets will be involved in this analysis. Related peak clusters were set as the target sequences, and a set of background clusters was generated with shuffleBed program from BEDTools (34) to randomly shuffle regions of the same size as the clusters throughout the gene regions. The parameter of motif length was set as 5, 6, 7,8. Top 10 significant motifs (P-value < 0.01) were used and displayed in the web page.

Annotation of WER-target associations
Basic information of WERs and target genes, such as official gene symbol, gene ID, gene type and genome location, were preferentially extracted from GENCODE (35) annotation files (human: v40, mouse: vM29). Ensembl (36), UCSC (37) and GtRNAdb (38) databases were used for information supplementation. Deprecated or substituted versions of genes were filtered out. Experimental details and sample information were obtained from the source literature or relevant datasets.
To better illustrate the WER-target associations, modification sites in RMVar (39), RMBase 2.0 (40) and m5C-Atlas (41) were collected to predict the role of RNA modifications in the regulatory relationship between WERs and targets. ENCORI (42), POSTAR3 (43) and NPinter4.0 (44) were used to predict the RNA-RNA or RNA-protein in-teractions of target genes to provide supplementary evidence of WER-target associations. Additionally, to explore the relationship between targets and diseases, TCGA (45) gene expression data were integrated into RM2target, allowing users to investigate the correlation between the expression of WERs and their targets in any cancer type of interest. Furthermore, RNA-disease associations with experimental evidence were collected from OMIM (46), Dis-GeNET (47), RNADisease (48), LncRNADisease2.0 (49) and CSCD2.0 (50). For disease-associated variants, data were collated from DisGeNET (47) and ClinVar (51), and for cancer-associated variants, from COSMIC (52), ICGC (53) and TCGA (45). These annotations may deliver new insights into the underlying pathogenesis from the perspective of WER-target associations. To unify all results, the genomic coordinates of all data resources were further converted to hg38 or mm39 using the LiftOver program (37). The gene symbols were also converted using the standard procedures.

Database and web interface implementation
MySQL was used to store and manage all data in RM2Target. The server-backend development was based on java and the web-frontend interfaces were implemented in Hyper Text Markup Language (HTML), Cascading Style Sheets (CSS) and JavaScript (JS). All the interactive diagrams were generated by ECharts to visualize the analysis results. Furthermore, RM2Target implemented a genome browser to present genomic annotations using UCSC Genome Browser (https://genome.ucsc.edu/) (37).

Database content
Currently, RM2Target includes a total of 1 619 653 WERtarget associations covering 63 WERs and nine RNA modifications from human and mouse (Table 1). Three evidence types of WER-target associations were deposited in RM2Target: (i) 'Validated': 1530 and 584 WER-target associations for human and mouse were validated by in vivo or in vitro experiments using western blot, RT-qPCR, RNA stability assay, RIP assay, luciferase reporter assay, etc. methods, respectively. (ii) 'Binding': 461 746 and 61 395 WER-target associations with binding evidence were predicted by high-throughput analyses for human and mouse, respectively. Among them, there were 451 016 WER-target associations with protein-RNA interactions for human and 55 004 for mouse. A total of 3100 WERtarget associations with protein-protein interactions were recorded in human. Growing studies have demonstrated that WERs may localize to the different region of chromatin and subsequently affect the deposition of RNA modifications co-transcriptionally (54)(55)(56). Therefore, RM2Target also included 7630 and 6391 WER-target associations with protein-DNA interactions in human and mouse, respectively, which may uncover another layer of gene expression regulation from the perspective of the transcription processes. (iii) 'Perturbation': 646 539 and 447 859 WERtarget associations were inferred from WERs perturbation for human and mouse, respectively. Previous studies have  shown that RNA modifications play vital roles in certain fundamental biological processes, such as mRNA stability (57), splicing (58) and translation (59). Consequently, RM2Target performed differential analyses at five levels, namely expression, modification, translation, stability and alternative splicing. For human, there were 339 604 WERtarget associations with RNA level changes, 60 176 with modification level changes, 117 866 with translation efficiency changes, 27 700 with mRNA stability changes and 101 193 with differential alternative splicing events. For mouse, there were 243 043 WER-target associations with RNA level changes, 38 107 with modification level changes, 90 067 with translation efficiency changes, 31 084 with mRNA stability changes and 45 558 with differential alternative splicing events. We observed a highly significant proportion of 89% of WER-target associations validated either by 'Binding' or 'Perturbation' evidence, indicating the reliability of WER-target associations in RM2Target (Figure 2A). Due to the analysis strategy of high-throughput sequencing data, the target genes are mainly mRNAs and lncRNAs ( Figure 2B).

Complex regulatory network among WERs and target genes revealed by RM2Target
Crosstalk between different WERs. The writers, erasers and readers of the same RNA modification often cooperate to control cellular phenotypes. For example, a recent study showed that the collaboration of a set of WERs of m6A ('writer: METTL14', 'eraser: ALKBH5' and 'reader: YTHDF3') regulates cancer growth and progression (60). Moreover, increasing evidence indicates a frequent crosstalk among different RNA modifications. For example, NSUN2-mediated m5C modifications can promote METTL3-mediated m6A modifications, and vice versa (61). m6A modifications will inhibit A-to-I editing level probably by blocking the binding of ADAR and target genes (62). We systematically analyzed the data collected in RM2Target to explore the crosstalk between different WERs across different RNA modifications. We observed intensive crosslinks between target genes of WERs from different modifications ( Figure 2C). As expected, the target genes of m6A writers METTL3 and METTL14 have a high degree of overlap ( Figure 2D), consistent with the fact that METTL3 and METTL14 often function as a heterodimer complex. Additionally, FBL, NOP56 and NOP58, writers of 2 -O-methylation, which are proved to act as the core proteins of box C/D small nucleolar ribonucleoprotein complexes (snoRNPs) (63), also share similar sets of target genes.
Crosstalk between RNA modifications and gene transcription. The deposition of RNA modifications is usually considered as a co-transcriptional process (54)(55)(56). More recently, increasing studies revealed that RNA modifications can also regulate gene transcription (64,65). By integrated analysis of peak sites from the ChIP-seq data of WERs and RNA modification sites, we found that the ChIP-seq peak sites of m6A WER, such as METTL3, YTHDC1, were close to corresponding m6A sites, and the peaks sites of m1A WERs, such as ALKBH3 was close to correspond-ing m1A sites ( Figure 2E), suggesting a strong crosstalk between these RNA modifications and gene transcription. We calculated the average distance between the ChIP-seq peak center of WER and nearest RNA modification sites and considered this average distance indicated the crosstalk between RNA modification and gene transcription. As a result, we observed that m6A modification is mostly involved in the crosstalk and m5C is least involved in the crosstalk ( Figure 2F).

RNA modification independent WER-target associations.
Recent studies showed that some known WERs of RNA modification have modification-independent functions. For instance, the m6A writer METTL3 could promote the translation of target gene PAPBC1 without m6A modification (66). The m6A writer METTL16 exerts an m6Aindependent function to facilitate translation and tumorigenesis (67). We systematically investigated the RNA modification dependent and independent WER-target associations using RM2Target data. We found that only 30% target genes have corresponding RNA modification sites from public databases such as RMbase and RMVar ( Figure 2G), which suggests that WERs might have RNA modification independent functions. However, the possibility that the current methods for the detection of RNA modifications may miss many RNA modification sites for various reasons cannot be ignored.

Web interface and usage
Compared to m6ATarget, the web interface of RM2Target has been redesigned to allow users to browse, search, and download all available RNA modification WER and target gene information more conveniently.
Browse and search, in the browse page ( Figure 3A), users can browse WER-target associations by WERs of interest. Firstly, the basic information for the selected WER was provided. A word cloud map was then provided to visualize the target genes of the selected WER. The size of each WER was determined by the confidence score, which was defined as the sum of supported literature/dataset number of the three evidence types. Motifs can be used to uncover the biological rules of binding targets. To identify the consensus motif of individual WERs, we use the top 200 most reliable targets determined by confidence score as described above to perform motif analysis. The identified consensus motifs of selected WER were shown in the 'Browse' page. As expected, the classical m6A motif 'RRACH' (R represents A or G, and H represents A, C or U) was detected in the corresponding m6A writers, erasers and readers in both human and mouse (Supplementary Figure S1A, B), and 77-99% of the targets contain the corresponding motifs (Supplementary Figure S1C). A browse table was provided to show the target genes of selected WER. We provide advanced filter functions in the browse table for users to further obtain more specific queries by combining conditions. Moreover, to help the user to filter the targets from noises, we provide confidence score, number of validated, binding and perturbation evidence, and motif status in the browse table. The targets were sorted by the confidence score. In the search page ( Figure 3B), users can search WER-target gene associ- Detail, by clicking on the RM2Target ID in the browse or search result table, a detail page containing the detail information of the WER-target association in a cell line/tissue type was shown. In the detail page, the details of different evidence types including 'Validated', 'Binding' and 'Perturbation' were provided ( Figure 3C). In the 'Validated evidence' section, literature sources, cell line/tissue information, experimental methods, and the effect of WER on target genes are provided. In the 'Binding evidence' section, the binding type such as 'protein-RNA' and 'protein-DNA', sequencing method, annotation of binding sites, and links to source data were provided. In the 'Perturbation evidence' section, the direction of WER perturbation and perturba-tion effects on target genes at the level of gene expression, translation, modification, stability and alternative splicing were provided. In addition, RM2Target provides many annotations for the target genes, including RNA modification sites, gene associated interactions and related diseases. Besides, the information of gene expression correlation between WER and target gene in cancers calculated from TCGA data were also integrated. By clicking on the analysis ID in the detail page, users can get the information on dataset information, analysis parameters, sample information, and analysis results.
RM2Target allows users to download all validated and potential targets in human and mouse. Detailed guidance on the usage of RM2Target can be found on the 'Help' page.

Application of RM2Target in cancer studies
The m6A writer METTL3 has been proved to be associated with cancer progression in a variety of cancer types (68).
Here, we showed how to explore the function of METTL3 in liver cancer using RM2Target. TCGA data shows METTL3 expression is significantly higher in tumor tissues than normal tissues in LIHC ( Figure 4A) and METTL3 expression is significantly associated with shorter overall survival in LIHC patients ( Figure 4B). We next used RM2Target to find potential targets of METTL3 and its m6A effector in liver cancer to provide candidates for further functional and mechanism studies. Since IGF2BP1 has most target records among m6A readers in HepG2, a liver cancer cell line, we selected IGF2BP1 as the potential m6A effector in liver cancer. We observed a significant overlap between METTL3 target genes and IGF2BP1 target genes in HepG2 ( Figure 4C). Moreover, there was also a large overlap between genes significantly correlated with METTL3 expression and genes significantly correlated with IGF2BP1 expression in TCGA LIHC gene expression data ( Figure 4C). 485 genes were overlapped between RM2Target results and TCGA results, which are candidate target genes of both METTL3 and IGF2BP1 ( Figure 4C). The 485 target genes were enriched in many cancer-related pathways, such as P53 signaling pathway and Apoptosis ( Figure 4D). Next, we performed survival analysis using univariate Cox model and found 36 targets significantly associated with overall survival ( Figure 4E). We then used RM2Target to browse the detailed information of these targets. We take ASNS (asparagine synthetase (glutamine-hydrolyzing)) as an example (http://rm2target. canceromics.org/#/detail/RM2Target 260294; http: //rm2target.canceromics.org/#/detail/RM2Target 900943). We noted that ASNS was associated with cancer and malignant neoplasm of liver according to the disease annotations in RM2Target ( Figure 4F). The expression level of ASNS was significantly downregulated upon METTL3 knockdown ( Figure 4G), and the expression level of METTL3 and ASNS showed positive correlation in TCGA LIHC data ( Figure 4H). The expression level of ASNS was significantly downregulated upon IGF2BP1 knockdown ( Figure 4I), and the expression level of IGF2BP1 and ASNS showed positive correlation in TCGA LIHC data ( Figure 4J). Additionally, ASNS harbours many m6A modification sites, and IGF2BP1 can bind to ASNS according to the POSTAR3 annotations collected in RM2Target. The high expression of ASNS was associated with poor prognosis according to the TCGA LIHC data ( Figure  4K). Taken together, the above results suggest that ASNS may be a reliable target gene of METTL3/IGF2BP1, and METTL3/m6A(ASNS)/IGF2BP1 axis may serve as a novel promising therapeutic target for liver cancer, which deserved to be investigated in depth.

SUMMARY AND PERSPECTIVES
Determining the regulatory network of WERs will be of great significance in the understanding of the molecular mechanisms underlying the regulation of RNA modifications and exploration of the potential application value of RNA modifications. To our knowledge, RM2Target is the first comprehensive and specific resources focusing on the targets of different WERs across various RNA modifications.
Our previous version m6A2Target received much attention and facilitated a lot of m6A related research. For examples, the researchers use m6A2Target to explore the tumorigenic mechanisms of m6A WERs in various cancer types (68,69). Compared to m6A2Target, RM2Target covers more types of RNA modifications and collects more WER-target gene associations which expands nearly 4 times more than m6A2Target. Using RM2Target to retrieve target genes for WERs of interest can help researchers conveniently and quickly identify the candidate genes, further promote the study of regulatory mechanisms. Moreover, researchers can explore the potential functions of RNA modifications with the perspective of crosstalk between different WERs and between different RNA modifications as mentioned above. However, RM2Target also has limitations. First, we only collected data from the two most frequently used organisms: human and mouse. Second, the target gene type and evidence type are not comprehensive enough. Currently, RM2Target does not provided WER-target associations for the target gene types such as circRNA and small RNAs. For the perturbation targets, we only focused on the five most studied downstream effects such as expression, modification, translation, stability, and alternative splicing. In addition, the targets were mainly derived from short read sequencing data. Long reads sequencing technology is a new trend for modification detection, which is not included in the current version. We will attempt to fix the above limitations in the future updates of RM2Target.
In conclusion, we expect that RM2Target will become a powerful resource for the field of RNA modification research. As a long-term goal, we would like to continually update RM2Target.

DATA AVAILABILITY
RM2Target is a comprehensive online database available at http://rm2target.canceromics.org.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.